Dose–response prediction for in-vitro drug combination datasets: a probabilistic approach

In this paper we propose PIICM, a probabilistic framework for dose–response prediction in high-throughput drug combination datasets. PIICM utilizes a permutation invariant version of the intrinsic co-regionalization model for multi-output Gaussian process regression, to predict dose–response surfaces in untested drug combination experiments. Coupled with an observation model that incorporates experimental uncertainty, PIICM is able to learn from noisily observed cell-viability measurements in settings where the underlying dose–response experiments are of varying quality, utilize different experimental designs, and the resulting training dataset is sparsely observed. We show that the model can accurately predict dose–response in held out experiments, and the resulting function captures relevant features indicating synergistic interaction between drugs. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-023-05256-6.


S2 Proof of Proposition 1
In this section we prove Proposition 1 from the main paper. The proof hinges on some properties of matrices that can be written in a specific way, namely as S = (A + QAQ) + (QA + AQ), where Q is a symmetric permutation matrix and A a positive semi-definite (PSD) matrix. The following lemma will be of use in the proof ii. S, M 1 and M 2 have the same eigenvectors. Moreover, if v is an eigenvector with eigenvalue λ 1 for M 1 and λ 2 for M 2 , it will have eigenvalue λ 1 + λ 2 for S.
v. S is PSD. Furthermore, let σ(S) + denote the set of all positive eigenvalues of S, then Proof.
i) Trivial using properties of symmetric permutation matrices, Q = Q −1 = Q T .
ii) Since both M 1 and M 2 are symmetric, they are diagonalizable, i.e.
where V i is the matrix whose columns are eigenvectors for M i and D i is a diagonal matrix containing the corresponding eigenvalues. Furthermore, it is easy to check that the matrices M 1 and M 2 commute, i.e. M 1 M 2 = M 2 M 1 and hence they are simultaneously diagonalizable [1, Theorem 1.3.12]: The result now follows.
Suppose λ = 0. Since v is the corresponding eigenvector, Sv = λv and hence QSv = λQv. Using that QS = SQ we obtain SQv = λQv, hence either Qv = 0 (ruled out since v is an eigenvector) or Qv is an eigenvector of S with eigenvalue λ. Now, since both SQv = λQv and Sv = λv, we can write Since λ = 0 by assumption, this implies that Qv − v = 0. Now since M 1 and M 2 have the same null space (since M 1 = QM 2 ), both λ 1 and λ 2 have to be non-zero, and hence by (iii) λ 1 = λ 2 .
v) Since S and M 2 have common eigenvectors according to (ii), it suffices to show for any eigenvector v how each corresponding eigenvalue of M 2 is related to the corresponding eigenvalue of S. Let v denote an eigenvector, with eigenvalue λ for S, λ 1 for M 1 and λ 2 for M 2 . There are three cases to consider.
Thus for any eigenvector v of S, the corresponding eigenvalue is either zero or 2λ 2 if λ 2 is positive, this S is PSD and the result follows.
Proposition 1. Let K c , K d and K x denote positive semi-definite matrices, while P andP denote symmetric permutation matrices. Furthermore, let σ(A) + denote the collection of positive eigenvalues of the matrix A. Then, the matrix is positive semi-definite (PSD) and Proof. First note that since the eigenvalues of a Kronecker product A ⊗ B is simply the product of all eigenvalues of A by all eigenvalues of B, it suffices to prove that the eigenvalues of are exactly the positive eigenvalues of (P K d + K d P ) ⊗P K x . Now, the proposition can be proved by simply showing that S can be written such that Lemma 1 applies. By using the mixed-product property of Kronecker products, (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD), and the identities where Q = (P ⊗P ) is a symmetric permutation matrix and C = ( Hence the result follows from Lemma 1(v).

S3 Full model description for single experiment dose-response
In this section, we provide the full model description used to fit each individual experiment to the doseresponse model in section ??. The model is provided solely for reference, and we refer to the original paper [2] for an extensive discussion of the model. The bayesynergy model takes as takes as input cell viability measurements y = (y 1 , . . . , y n ), measured at a corresponding set of drug concentrations X = (x 1 , . . . , x n ). Typically, drug combination experiments are performed on a grid of concentrations, and we could assume that X = X 1 × X 2 , where X 1 = (x 1 , . . . , x n1 ) denote the n 1 concentrations of drug 1 and X 2 = (x 2 , . . . , x n2 ) the n 2 concentrations of drug 2. The experiment may also include technical replicates, where cell viability have been measured multiple times at each drug concentration. We can assume that the dataset contains n rep technical replicates, and thus the measured viabilities can be denoted by y ijr where i = 1, . . . , n 1 , j = 1, . . . , n 2 and r = 1, . . . , n rep . A heteroskedastic noise model is assumed, that depends on the value of the dose-response function f at the corresponding concentration, and the GP prior is given a Matérn-3/2 covariance function. The complete model description is as follows . . , n 1 , j = 1, . . . , n 2 , r = 1, . . . , n rep , σ ∼ Inv-Gamma (5, 1) , λ = 0.005.
Note that the implementation of this model in the bayesynergy R package does not need a complete grid of observations, nor a complete set of replications. It takes as input a set of cell viability measurements y and the corresponding drug concentrations X and provides samples from the posterior distribution of the model at each drug concentration x in X.
We use this approach to fit each dose-response experiment, and draw samples from the posterior distribution of the latent GP z(x), and the dose-response function f (x), at drug concentration x. If samples are needed at a location x * not observed in the original dataset, z(x * ) and f (x * ) are sampled from the posterior predictive distribution. From these samples, we compute posterior means and variances, which are used to construct the matrices Z, S for the latent GP, and F and S F for the dose-response function.
Additionally, we use a simplified version of this model to estimate the individual monotherapies for each drug on every cell line. These are used to construct the estimatep 0 * in equation (??) of the main text. Given a monotherapy dataset of cell viabilities y = (y 1 , . . . , y n ) measured at concentrations X = (x 1 , . . . , x n ), we estimate the following model for each drug combination on every cell line. For any two drugs on any cell line, we can then compute a sample from the predictive posterior distribution of p 0 (x * ) by first sampling the individual monotherapy evaluations h 1 (x 1 * ) and h 2 (x 2 * ) from their individual posterior predictive distributions and constructing p 0 (x * ) = h 1 (x 1 * )h 2 (x 2 * ). The estimatep 0 * can then be computed by averaging across all samples.

S4 Comparison with IDACombo
We include here a short comparison of the PIICM modelling framework with IDACombo [3]. The IDACombo framework is based on the notion of Independent Drug Action (IDA), which hypothesises that most drugs act independently of each other and do not interact. In fact, the IDA principle states that the expected effect of a drug combination is exactly equal to the effect of the single most effective drug of the combination administered on its own. This principle is sometimes referred to as the highest single agent (HSA) in synergy modelling. Predictions with IDACombo for drug combination (A, B) at concentration x = (x A , x B ) is made by first finding the minimum of the two monotherapy functions for the drugs, i.e. min{h Then this is done for each of the i = 1, . . . , n c cell lines in the training dataset and these are averaged to produce a final prediction of expected dose-response, i.e.: where the monotherapy functions h A,i and h B,i denote the monotherapy functions of drugs A and B on the ith cell line, respectively. In many ways, this is similar to an underlying assumption of the PIICM, where we model dose-response as a two-component function, f (x) = p 0 (x) + ∆(x), where f denotes the dose-response function taking as arguments the drug concentrations x. The non-interaction component, p 0 , captures our prior expectation regarding the behaviour of f , and while we prefer the Bliss non-interaction assumption for this -we could have just as easily chosen something else, like IDA. The interaction component, ∆, is modelled as a zeromean GP pushed through a transformation function g that has the property that g(0) = 0. In terms of prior expectation, we have that E [f ] ≈ p 0 , for all x. The PIICM is constructed to model everything in p 0 as a function of the monotherapy measurements, and focuses its attention on modelling whatever is left over by the multi-output GP. If the model becomes uncertain, or the GP is extrapolating outside the data range, the behaviour of the GP is such that predictions made with PIICM will be shrunken towards p 0 .
The key difference in the two frameworks is that PIICM is focused on predicting the deviations from a non-interaction assumption, while IDACombo produces predictions that are equal to a non-interaction assumption. Another key difference is that IDACombo only produces an averaged predicted response, across multiple cell lines. This means that for a given drug combination and drug concentration, IDACombo will predict the same response value for all cell lines in the test dataset, while PIICM (and both comboFM and comboLTR) produce cell-line specific predictions.
While we can compare predictions made by IDACombo to PIICM or comboLTR, we should note that such a comparison is not fair to IDACombo. PIICM and comboLTR are both trained on actual combination data, and not only monotherapy measurements. In fact, it would be more sensible to compare IDACombo against simply the baseline Bliss modelp 0 utilized in the main results section. We show the results of such a comparison in Figure S1, which is a version of Figure 6 with predictions made by IDACombo. Using a similar approach as in the original IDACombo paper, we only evaluate the performance of the models at the maximum concentration tested for each drug in the combination. When predicting raw viability measurements, IDACombo performs poorly compared to both PIICM and the Bliss baseline model, with Pearson's correlation between observed and predicted at only 0.60 for IDACombo, compared to 0.83 for the baseline and 0.88 for PIICM. The poorer performance compared to the baseline is mainly due to the aforementioned property that IDACombo does not produce cell-line specific predictions. Following further the IDACombo paper, which report the average performance across cell lines, we compute the average of our predictions and observed viability measurements across all the cell lines in the test set. The results are visualized in Figure S2, which again shows the poorer performance of IDACombo compared to both the baseline and the PIICM model. Pearson's correlation between average observed and average predicted is 0.48 for IDACombo, 0.92 for the baseline, and 0.97 for PIICM. Again, this is mostly due to IDACombo not supplying a cell-line specific prediction, but rather a single number representing an average dose response across cell lines. Given that synergy is quite rare, this averaged prediction might be an okay approximation to the average dose-response when the number of cell lines become large (as most cell lines will tend to follow e.g. the IDA principle), however it does not perform well in a small sample train-test split setting as here.
Figure S1: The figure shows predicted versus observed viability for the baseline, PIICM and the IDACombo models. Each dot in the figure represents a unique (cell line, drug A, drug B, conc. drug A, conc. drug B)-quintuplet, and is colored according to the density of points. Furthermore, the comparison has only been done using the maximum concentration for each drug in a combination. Overall, both the baseline model and PIICM have a comparatively large advantage over IDACombo in terms of predictive performance. Figure S2: The figure shows the average predicted versus average observed viability for the baseline, PIICM and the IDACombo models. The average has been taken across the cell lines, and hence each dot in the figure represents a unique (drug A, drug B, conc. drug A, conc. drug B)-quadruplet, colored according to the density of points. We see the same pattern as before, with both the baseline model and PIICM outperforming IDACombo in terms of predictive performance.